In order to formally understand the power of neural computing, we first needto crack the frontier of threshold circuits with two and three layers, a regimethat has been surprisingly intractable to analyze. We prove the firstsuper-linear gate lower bounds and the first super-quadratic wire lower boundsfor depth-two linear threshold circuits with arbitrary weights, and depth-threemajority circuits computing an explicit function. $\bullet$ We prove that for all $\epsilon\gg \sqrt{\log(n)/n}$, thelinear-time computable Andreev's function cannot be computed on a$(1/2+\epsilon)$-fraction of $n$-bit inputs by depth-two linear thresholdcircuits of $o(\epsilon^3 n^{3/2}/\log^3 n)$ gates, nor can it be computed with$o(\epsilon^{3} n^{5/2}/\log^{7/2} n)$ wires. This establishes an average-case``size hierarchy'' for threshold circuits, as Andreev's function is computableby uniform depth-two circuits of $o(n^3)$ linear threshold gates, and byuniform depth-three circuits of $O(n)$ majority gates. $\bullet$ We present a new function in $P$ based on small-biased sets, whichwe prove cannot be computed by a majority vote of depth-two linear thresholdcircuits with $o(n^{3/2}/\log^3 n)$ gates, nor with $o(n^{5/2}/\log^{7/2}n)$wires. $\bullet$ We give tight average-case (gate and wire) complexity results forcomputing PARITY with depth-two threshold circuits; the answer turns out to bethe same as for depth-two majority circuits. The key is a new random restriction lemma for linear threshold functions. Ourmain analytical tool is the Littlewood-Offord Lemma from additivecombinatorics.
展开▼